217 research outputs found

    The PathOlogist: an automated tool for pathway-centric analysis

    Background: The PathOlogist is a new tool designed to transform large sets of gene expression data into quantitative descriptors of pathway-level behavior. The tool aims to provide a robust alternative to the search for single-gene-to-phenotype associations by accounting for the complexity of molecular interactions. Results: Molecular abundance data are used to calculate two metrics, 'activity' and 'consistency', for each pathway in a set of more than 500 canonical molecular pathways (source: Pathway Interaction Database, http://pid.nci.nih.gov). The tool then allows a detailed exploration of these metrics through integrated visualization of pathway components and structure, hierarchical clustering of pathways and samples, and statistical analyses designed to detect associations between pathway behavior and clinical features. Conclusions: The PathOlogist provides a straightforward means to identify the functional processes, rather than individual molecules, that are altered in disease. The statistical power and biologic significance of this approach are made easily accessible to laboratory researchers and informatics analysts alike. Here we show, as an example, how the PathOlogist can be used to establish pathway signatures that robustly differentiate breast cancer cell lines based on response to treatment.

    Using ILP to Identify Pathway Activation Patterns in Systems Biology

    We show a logical aggregation method that, combined with propositionalization methods, can construct novel structured biological features from gene expression data. We do this to gain understanding of pathway mechanisms, for instance, those associated with a particular disease. We illustrate this method on the task of distinguishing between two types of lung cancer: Squamous Cell Carcinoma (SCC) and Adenocarcinoma (AC). We identify pathway activation patterns in pathways previously implicated in the development of cancers. Our method identified a model with comparable predictive performance to the winning algorithm of a recent challenge, while providing biologically relevant explanations that may be useful to a biologist.

    Mining Biological Pathways Using WikiPathways Web Services

    WikiPathways is a platform for creating, updating, and sharing biological pathways [1]. Pathways can be edited and downloaded using the wiki-style website. Here we present a SOAP web service that provides programmatic access to WikiPathways that is complementary to the website. We describe the functionality that this web service offers and discuss several use cases in detail. Exposing WikiPathways through a web service opens up new ways of utilizing pathway information and assisting the community curation process.

    Functional Correlations of Pathogenesis-Driven Gene Expression Signatures in Tuberculosis

    Tuberculosis remains a major health threat and its control depends on improved measures of prevention, diagnosis and treatment. Biosignatures can play a significant role in the development of novel intervention measures against TB, and blood transcriptional profiling is increasingly exploited for their rational design. Such profiles also reveal fundamental biological mechanisms associated with the pathology of the disease. We have compared whole blood gene expression in TB patients, as well as in healthy infected and uninfected individuals, in a cohort in The Gambia, West Africa, and validated previously identified signatures, showing high similarities of expression profiles among different cohorts. In this study, we applied a unique combination of classical gene expression analysis with pathway and functional association analysis integrated with intra-individual expression correlations. These analyses were employed for identification of new disease-associated gene signatures, identifying a network of Fc gamma receptor 1 signaling with correlating transcriptional activity as a hallmark of gene expression in TB. Remarkable similarities to characteristic signatures in the autoimmune disease systemic lupus erythematosus (SLE) were observed. Functional gene clusters of immunoregulatory interactions involving the JAK-STAT pathway, sensing of microbial patterns by Toll-like receptors, and IFN signaling provide detailed insights into the dysregulation of critical immune processes in TB, involving active expression of both pro-inflammatory and immunoregulatory systems. We conclude that (i) transcriptomics provides a robust system for identification and validation of biosignatures for TB and (ii) application of integrated analysis tools yields novel insights into functional networks underlying TB pathogenesis.

    Constitutive gene expression profile segregates toxicity in locally advanced breast cancer patients treated with high-dose hyperfractionated radical radiotherapy

    Breast cancer patients show a wide variation in normal tissue reactions after radiotherapy. The individual sensitivity to x-rays limits the efficiency of the therapy. Prediction of individual sensitivity to radiotherapy could help to select the radiation protocol and to improve treatment results. The aim of this study was to assess the relationship between gene expression profiles of ex vivo un-irradiated and irradiated lymphocytes and the development of toxicity due to high-dose hyperfractionated radiotherapy in patients with locally advanced breast cancer. Raw data from microarray experiments were uploaded to the Gene Expression Omnibus Database (GEO accession GSE15341). We obtained a small group of 81 genes significantly regulated by radiotherapy, grouped into 50 relevant pathways. Using ANOVA and t-tests, we found 20 and 26 constitutive genes (0 Gy) that segregate patients with and without acute and late toxicity, respectively. Non-supervised hierarchical clustering was used for the visualization of results. Six and nine pathways, respectively, were significantly regulated. For irradiated lymphocytes (2 Gy), we found 29 genes that separate patients with and without acute toxicity. Those genes were grouped into four significant pathways. We could not identify a set of genes that segregates patients with and without late toxicity. In conclusion, we have found an association between the constitutive gene expression profile of peripheral blood lymphocytes and the development of acute and late toxicity in consecutive, unselected patients. These observations suggest the possibility of predicting normal tissue response to irradiation in high-dose non-conventional radiation therapy regimens. Prospective studies with larger numbers of patients are needed to validate these preliminary results.

    Clinical, ultrasound and molecular biomarkers for early prediction of large for gestational age infants in nulliparous women: an international prospective cohort study

    Objective: To develop a prediction model for term infants born large for gestational age (LGA) by customised birthweight centiles. Methods: International prospective cohort of nulliparous women with singleton pregnancy recruited to the Screening for Pregnancy Endpoints (SCOPE) study. LGA was defined as birthweight above the 90th customised centile, including adjustment for parity, ethnicity, maternal height and weight, fetal gender and gestational age. Clinical risk factors, ultrasound parameters and biomarkers at 14–16 or 19–21 weeks were combined into a prediction model for LGA infants at term using stepwise logistic regression in a training dataset. Prediction performance was assessed in a validation dataset using area under the Receiver Operating Characteristics curve (AUC) and detection rate at fixed false positive rates. Results: The prevalence of LGA at term was 8.8% (n = 491/5628). Clinical and ultrasound factors selected in the prediction model for LGA infants were maternal birthweight, gestational weight gain between 14–16 and 19–21 weeks, and fetal abdominal circumference, head circumference and uterine artery Doppler resistance index at 19–21 weeks (AUC 0.67; 95%CI 0.63–0.71). Sensitivity of this model was 24% and 49% for a fixed false positive rate of 10% and 25%, respectively. The addition of biomarkers resulted in selection of random glucose, LDL-cholesterol, vascular endothelial growth factor receptor-1 (VEGFR1) and neutrophil gelatinase-associated lipocalin (NGAL), but with minimal improvement in model performance (AUC 0.69; 95%CI 0.65–0.73). Sensitivity of the full model was 26% and 50% for a fixed false positive rate of 10% and 25%, respectively. Conclusion: Prediction of LGA infants at term has limited diagnostic performance before 22 weeks but may have a role in contingency screening in later pregnancy

    Nano Random Forests to mine protein complexes and their relationships in quantitative proteomics data

    Ever-increasing numbers of quantitative proteomics data sets constitute an underexploited resource for investigating protein function. Multiprotein complexes often follow consistent trends in these experiments, which could provide insights about their biology. Yet, as more experiments are considered, a complex’s signature may become conditional and less identifiable. Previously we successfully distinguished the general proteomic signature of genuine chromosomal proteins from hitchhikers using the Random Forests (RF) machine learning algorithm. Here we test whether small protein complexes can define distinguishable signatures of their own, despite the assumption that machine learning needs large training sets. We show, with simulated and real proteomics data, that RF can detect small protein complexes and relationships between them. We identify several complexes in quantitative proteomics results of wild-type and knockout mitotic chromosomes. Other proteins covary strongly with these complexes, suggesting novel functional links for later study. Integrating the RF analysis for several complexes reveals known interdependences among kinetochore subunits and a novel dependence between the inner kinetochore and condensin. Ribosomal proteins, although identified, remained independent of kinetochore subcomplexes. Together these results show that this complex-oriented RF (NanoRF) approach can integrate proteomics data to uncover subtle protein relationships. Our NanoRF pipeline is available online

    Knowledge-based matrix factorization temporally resolves the cellular responses to IL-6 stimulation

    Background: External stimulations of cells by hormones, cytokines or growth factors activate signal transduction pathways that subsequently induce a re-arrangement of cellular gene expression. The analysis of such changes is complicated, as they consist of multi-layered temporal responses. While classical analyses based on clustering or gene set enrichment only partly reveal this information, matrix factorization techniques are well suited for a detailed temporal analysis. In signal processing, factorization techniques incorporating data properties like spatial and temporal correlation structure have been shown to be robust and computationally efficient. However, such correlation-based methods have so far not been applied in bioinformatics, because large-scale biological data rarely imply a natural order that allows the definition of a delayed correlation function. Results: We therefore develop the concept of graph-decorrelation. We encode prior knowledge like transcriptional regulation, protein interactions or metabolic pathways in a weighted directed graph. By linking features along this underlying graph, we introduce a partial ordering of the features (e.g. genes) and are thus able to define a graph-delayed correlation function. Using this framework as a constraint on the matrix factorization task allows us to set up the fast and robust graph-decorrelation algorithm (GraDe). To analyze alterations in the gene response in IL-6-stimulated primary mouse hepatocytes, we performed a time-course microarray experiment and applied GraDe. In contrast to standard techniques, the extracted time-resolved gene expression profiles showed that IL-6 activates genes involved in cell cycle progression and cell division. Genes linked to metabolic and apoptotic processes are down-regulated, indicating that IL-6-mediated priming renders hepatocytes more responsive towards cell proliferation and reduces expenditures for the energy metabolism. Conclusions: GraDe provides a novel framework for the decomposition of large-scale 'omics' data. We were able to show that incorporating prior knowledge into the separation task leads to a much more structured and detailed separation of the time-dependent responses upon IL-6 stimulation compared to standard methods. A Matlab implementation of the GraDe algorithm is freely available at http://cmb.helmholtz-muenchen.de/grade.

    “Topological Significance” Analysis of Gene Expression and Proteomic Profiles from Prostate Cancer Cells Reveals Key Mechanisms of Androgen Response

    The problem of prostate cancer progression to androgen independence has been extensively studied. Several studies systematically analyzed gene expression profiles in the context of biological networks and pathways, uncovering novel aspects of prostate cancer. Despite significant research efforts, the mechanisms underlying tumor progression are poorly understood. We applied a novel approach to reconstruct system-wide molecular events following stimulation of LNCaP prostate cancer cells with synthetic androgen and to identify potential mechanisms of androgen-independent progression of prostate cancer. We have performed concurrent measurements of gene expression and protein levels following the treatment using microarrays and iTRAQ proteomics. Sets of up-regulated genes and proteins were analyzed using our novel concept of "topological significance". This method combines high-throughput molecular data with the global network of protein interactions to identify nodes which occupy significant network positions with respect to differentially expressed genes or proteins. Our analysis identified the network of growth factor regulation of cell cycle as the main response module for androgen treatment in LNCaP cells. We show that the majority of signaling nodes in this network occupy significant positions with respect to the observed gene expression and proteomic profiles elicited by androgen stimulus. Our results further indicate that growth factor signaling probably represents a "second phase" response, not directly dependent on the initial androgen stimulus. We conclude that in prostate cancer cells the proliferative signals are likely to be transmitted from multiple growth factor receptors by a multitude of signaling pathways converging on several key regulators of cell proliferation such as c-Myc, Cyclin D and CREB1. Moreover, these pathways are not isolated but constitute an interconnected network module containing many alternative routes from inputs to outputs. If the whole network is involved, a precisely formulated combination therapy may be required to fight the tumor growth effectively.